Towards Audio-Visual On-line Diarization Of Participants In Group Meetings

Authors

  • Hayley Hung
  • Gerald Friedland
Abstract

We propose a fully automated, unsupervised, and non-intrusive method of identifying the current speaker audio-visually in a group conversation. This is achieved without specialized hardware, user interaction, or prior assignment of microphones to participants. Speakers are identified acoustically using a novel on-line speaker diarization approach. The output is then used to find the corresponding person in a four-camera video stream by approximating individual activity with computationally efficient features. We present results showing the robustness of the association on over 4.5 hours of non-scripted audio-visual meeting data.
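
The audio-to-video association step in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the diarization output is available as a per-frame speaking indicator for each speaker cluster, that each camera stream has been reduced to a per-frame activity value, and that Pearson correlation is an acceptable matching criterion. The names `pearson`, `associate`, `speech`, and `motion` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def associate(speech, motion):
    """Map each speaker cluster to the video stream whose activity correlates best.

    speech: dict of speaker id -> per-frame speaking indicator (assumed input)
    motion: dict of camera id  -> per-frame visual activity value (assumed input)
    """
    return {
        speaker: max(motion, key=lambda cam: pearson(activity, motion[cam]))
        for speaker, activity in speech.items()
    }


# Toy data: two speaker clusters and two camera streams over six frames.
speech = {
    "spk0": [1, 1, 0, 0, 0, 0],
    "spk1": [0, 0, 0, 1, 1, 0],
}
motion = {
    "cam_a": [0.1, 0.2, 0.1, 0.9, 0.8, 0.2],
    "cam_b": [0.9, 0.7, 0.2, 0.1, 0.1, 0.2],
}
mapping = associate(speech, motion)
```

In this sketch each speaker is matched independently, so two speakers may be mapped to the same camera. A one-to-one assignment would need an extra constraint that the abstract does not specify.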

Similar resources

Speech/Non-Speech Detection in Meetings from Automatically Extracted low Resolution Visual Features

In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues in group meetings. Traditionally, the task of speech/non-speech detection or speaker diarization tries to find “who speaks and when” from audio features only. In this paper, we investigate more systematically how speaking status can be estimated from low resolution video. We e...

Multimodal Speaker Diarization Utilizing Face Clustering Information

Multimodal clustering/diarization tries to answer the question “who spoke when” by using audio and visual information. Diarization consists of two steps: first, segmentation of the audio and detection of the speech segments, and then clustering of the speech segments to group the speakers. This task has been mainly studied on audiovisual data from meetings, news broadcasts or talk ...

Audio Segmentation for Meetings Speech Processing

Audio Segmentation for Meetings Speech Processing, by Kofi Agyeman Boakye. Doctor of Philosophy in Engineering, Electrical Engineering and Computer Sciences, University of California, Berkeley. Professor Nelson Morgan, Chair. Perhaps more than any other domain, meetings represent a rich source of content for spoken language research and technology. Two common (and complementary) forms of meeting spee...

Analysis of the No Return Point Hypothesis: The Effect of Audio and Visual Stimuli in the Fast Movements Inhibition

Background. The No Return Point hypothesis is a research area associated with the motor program. The hypothesis emphasizes an inability to inhibit a movement once the motor program has started it. Several factors affect the mechanism of this inhibition. Objectives. In this study, we investigate the effects of audio and visual stimuli on blocking quick moves to ...

Effectiveness of Media Intervention on Students' Attitudes toward Drug and Tobacco: Based on Health Education and Legal Consequences

Background and purpose: Drug and tobacco addiction is one of the major threats to adolescents, and educating this group could be of great benefit in preventing the problem. The purpose of this study was to investigate the effectiveness of media intervention on students' attitudes towards drug and tobacco use. Materials and methods: In this quasi-experimental research a male high school was ra...

Publication date: 2008